Optimized combined-clustering methods for finding replicated criminal websites
نویسندگان
چکیده
To be successful, cybercriminals must figure out how to scale their scams. They duplicate content on new websites, often staying one step ahead of defenders that shut down past schemes. For some scams, such as phishing and counterfeitgoods shops, the duplicated content remains nearly identical. In others, such as advanced-fee fraud and online Ponzi schemes, the criminal must alter content so that it appears different in order to evade detection by victims and law enforcement. Nevertheless, similarities often remain, in terms of the website structure or content, since making truly unique copies does not scale well. In this paper, we present a novel optimized combined clustering method that links together replicated scam websites, even when the criminal has taken steps to hide connections. We present automated methods to extract key website features, including rendered text, HTML structure, file structure and screenshots. We describe a process to automatically identify the best combination of such attributes to most accurately cluster similar websites together. To demonstrate the method’s applicability to cybercrime, we evaluate its performance against two collected datasets of scam websites: fake-escrow services and high-yield investment programs (HYIPs). We show that our method more accurately groups similar websites together than does existing general-purpose consensus clustering methods.
منابع مشابه
Reeling in Big Phish with a Deep MD5 Net
Phishing continues to grow as phishers discover new exploits and attack vectors for hosting malicious content; the traditional response using takedowns and blacklists does not appear to impede phishers significantly. A handful of law enforcement projects — for example the FBI's Digital PhishNet and the Internet Crime and Complaint Center (ic3.gov) — have demonstrated that they can collect phish...
متن کاملThe Deadliest Catch: Reeling In Big Phish With a Deep MD5 Net
Phishing continues to grow as phishers discover new exploits and attack vectors for hosting malicious content; the traditional response using takedowns and blacklists does not appear to impede phishers significantly. A handful of law enforcement projects — for example the FBI's Digital PhishNet and the Internet Crime and Complaint Center (ic3.gov) — have demonstrated that they can collect phish...
متن کاملThe Application of Combined Fuzzy Clustering Model and Neural Networks to Measure Valuably of Bank Customers
Currently, acquisition of resources in banks is subject to attraction of the resources of banking customers. Meanwhile, the Bank’s valuable customers are one of the best resources to make profit for banks. Several different models are introduced for evaluation of profitability of the customers; but most of them are classical models and they are unable to evaluate the customers in complete and o...
متن کاملThe Application of Combined Fuzzy Clustering Model and Neural Networks to Measure Valuably of Bank Customers
Currently, acquisition of resources in banks is subject to attraction of the resources of banking customers. Meanwhile, the Bank’s valuable customers are one of the best resources to make profit for banks. Several different models are introduced for evaluation of profitability of the customers; but most of them are classical models and they are unable to evaluate the customers in complete and o...
متن کاملAn Optimized Firefly Algorithm based on Cellular Learning Automata for Community Detection in Social Networks
The structure of the community is one of the important features of social networks. A community is a sub graph which nodes have a lot of connections to nodes of inside the community and have very few connections to nodes of outside the community. The objective of community detection is to separate groups or communities that are linked more closely. In fact, community detection is the clustering...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- EURASIP J. Information Security
دوره 2014 شماره
صفحات -
تاریخ انتشار 2014